ARB: A Hardware Mechanism for Dynamic Reordering of Memory References
Authors
Abstract
To exploit instruction level parallelism, it is important not only to execute multiple memory references per cycle, but also to reorder memory references, especially to execute loads before stores that precede them in the sequential instruction stream. To guarantee correctness of execution in such situations, memory reference addresses have to be disambiguated. This paper presents a novel hardware mechanism, called an Address Resolution Buffer (ARB), for performing dynamic reordering of memory references. The ARB supports the following features: (i) dynamic memory disambiguation in a decentralized manner, (ii) multiple memory references per cycle, (iii) out-of-order execution of memory references, (iv) unresolved loads and stores, (v) speculative loads and stores, and (vi) memory renaming. The paper presents the results of a simulation study that we conducted to verify the efficacy of the ARB for a superscalar processor. The paper also shows the ARB’s application in a multiscalar processor.
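The disambiguation idea in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the names `ToyARB`, `Entry`, and `seq` (a program-order sequence number) are hypothetical, the tables are unbounded Python dictionaries rather than a fixed-size buffer, and the abstract itself does not describe the ARB's internal organization. The model lets a load run before an earlier store, reports the loads that a late store invalidates, and keeps several uncommitted values per address as a stand-in for memory renaming.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Per-address bookkeeping (hypothetical layout, not taken from the paper)."""
    loads: set = field(default_factory=set)     # seq numbers of loads already executed
    stores: dict = field(default_factory=dict)  # seq number -> uncommitted store value


class ToyARB:
    """Toy model of dynamic memory disambiguation keyed by address."""

    def __init__(self, memory=None):
        self.memory = dict(memory or {})  # committed (architectural) memory
        self.entries = {}                 # addr -> Entry for in-flight references

    def load(self, seq, addr):
        """Execute a load, possibly before earlier stores have executed."""
        entry = self.entries.setdefault(addr, Entry())
        entry.loads.add(seq)
        earlier = [s for s in entry.stores if s < seq]
        if earlier:
            # Forward from the closest earlier store in program order.
            return entry.stores[max(earlier)]
        return self.memory.get(addr, 0)

    def store(self, seq, addr, value):
        """Execute a store; return seq numbers of later loads that read a stale value."""
        entry = self.entries.setdefault(addr, Entry())
        entry.stores[seq] = value
        later_stores = [s for s in entry.stores if s > seq]
        bound = min(later_stores) if later_stores else float("inf")
        # Loads beyond the next later store took their value from that store instead.
        return sorted(l for l in entry.loads if seq < l < bound)

    def commit(self, seq):
        """Retire all references up to seq, writing stores to memory in program order."""
        for addr, entry in list(self.entries.items()):
            for s in sorted(s for s in entry.stores if s <= seq):
                self.memory[addr] = entry.stores.pop(s)
            entry.loads = {l for l in entry.loads if l > seq}
            if not entry.loads and not entry.stores:
                del self.entries[addr]


arb = ToyARB({0x10: 1})
stale_value = arb.load(seq=3, addr=0x10)           # load runs ahead of store 1
violations = arb.store(seq=1, addr=0x10, value=7)  # store 1 exposes the stale load
fresh_value = arb.load(seq=3, addr=0x10)           # re-executed load sees the store
arb.commit(3)
```

In this model the late store reports load 3 as violated, which corresponds to the point where a processor would squash and re-execute the load. A real buffer would also have to handle a full table and partial-word accesses, which the sketch ignores.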
Similar resources
Franklin and Sohi : Arb - a Hardware Mechanism for Dynamic Reordering of Memory
To exploit instruction level parallelism, it is important not only to execute multiple memory references per cycle, but also to reorder memory references, especially to execute loads before stores that precede them in the sequential instruction stream. To guarantee correctness of execution in such situations, memory reference addresses have to be disambiguated. This paper presents a novel hardwa...
Memory Barriers: a Hardware View for Software Hackers
So what possessed CPU designers to cause them to inflict memory barriers on poor unsuspecting SMP software designers? In short, because reordering memory references allows much better performance, and so memory barriers are needed to force ordering in things like synchronization primitives whose correct operation depends on ordered memory references. Getting a more detailed answer to this quest...
Dynamic Memory Disambiguation Using the Memory Conflict Buffer
To exploit instruction level parallelism, compilers for VLIW and superscalar processors often employ static code scheduling. However, the available code reordering may be severely restricted due to ambiguous dependences between memory instructions. This paper introduces a simple hardware mechanism, referred to as the memory conflict buffer, which facilitates static code scheduling in the presence...
The limits of a decoupled out-of-order superscalar architecture
This thesis presents a study into a technique for improving performance in out-of-order superscalar architectures. It identifies three technological trends limiting superscalar performance; they are the increasing cost of a main memory access, control dependencies and the greater hardware complexity of out-of-order execution. Decoupling is a technique that can provide higher performance through ...
DAISY: Dynamic Compilation for 100% Architectural Compatibility
Although VLIW architectures offer the advantages of simplicity of design and high issue rates, a major impediment to their use is that they are not compatible with the existing software base. We describe new simple hardware features for a VLIW machine we call DAISY (Dynamically Architected Instruction Set from Yorktown). DAISY is specifically intended to emulate existing architectures, so that ...
Journal title:
- IEEE Trans. Computers
Volume 45, Issue
Pages -
Publication date 1996